Abstract. In this paper we estimate the propagation of liquidity shocks through interbank
markets when the information about the underlying credit network is incomplete.
We show that techniques such as Maximum Entropy currently used to reconstruct credit 
networks severely underestimate the risk of contagion by assuming a trivial (fully con- 
nected) topology, a type of network structure which can be very different from the one 
empirically observed. We propose an efficient message-passing algorithm to explore the 
space of possible network structures, and show that a correct estimation of the network 
degree of connectedness leads to more reliable estimations for systemic risk. This
algorithm is also able to produce maximally fragile structures, providing a practical upper
bound for the risk of contagion when the actual network structure is unknown. We test
our algorithm on ensembles of synthetic data encoding some features of real financial 
networks (sparsity and heterogeneity), finding that more accurate estimations of risk can 
be achieved. Finally we find that this algorithm can be used to control the amount of 
information regulators need to require from banks in order to sufficiently constrain the 
reconstruction of financial networks. 



1. Introduction 

The estimation of the robustness of a financial network to shocks and crashes is of
central importance for assessing the stability of an economic system. Recent dramatic events
exposed the fragility of many economies, supporting the claim that "the world's financial
system can collapse like a row of dominoes" [1]. As a result, governments and international
organizations became increasingly concerned about systemic risk. The banking system is
thought to be a fundamental channel in the propagation of shocks to the entire economy: 
the economic distress of an insolvent bank can be transmitted to its creditors by interbank 
linkages, so that a shock can easily propagate to the whole network. Unfortunately, detailed
data on banks' bilateral exposures are not always available, and institutions are often left
with the problem of assessing the resilience of a system to financial shocks from an
incomplete information set. In this framework the reconstruction of bilateral exposures
becomes a central issue for the estimation of risk, and requires sophisticated inference
schemes to obtain reliable estimates. Among several methods, a commonly used tool for
this task is the so-called entropy maximization method [2, 3, 4, 5].
The main limitation of this procedure is that it assumes a market structure which can
be quite different from the actual one: it tends to spread the debt as evenly as possible,
without assuming any heterogeneity in the structure of the network [6]. Unfortunately,
these assumptions lead to an underestimation of the extent of contagion, as the
vulnerability to financial contagion depends crucially on the pattern of interbank
linkages. Stress-tests used to analyze this dependence quantitatively confirm this result
both for simulated and real data, as shown in figures 2 and 3 and in Ref. [6].
In this paper we will introduce a message-passing algorithm to overcome this limitation, 
and to sample efficiently the space of possible structures for the network. This method 
can be used to propose plausible candidates for the real network structure, and to produce 
worst case scenarios for the spread of financial contagion. We remark that despite the
high cardinality of the set of possible network structures ($\sim 2^{N^2}$), we are able to
generate plausible configurations in a time which scales quadratically in the number of
unknown entries of the liability matrix.

In section 2 we introduce the main concepts and define the problem of network
reconstruction, while in section 3 we present the Maximum Entropy (ME) algorithm, a
commonly used procedure to infer credit networks from incomplete datasets. In section 4
we present the idea which allows our algorithm to explore the space of network structures
and extend the validity of ME. Section 5 describes the stress-test which we employ to
analyze the robustness of financial networks, and in section 6 we apply all these ideas to
synthetic datasets. In section 7 we discuss the reliability of the reconstruction algorithm
as a function of the policy adopted by regulatory institutions. Finally, in section 8 we draw
our conclusions.

2. Framework 

Let us consider a set of N banks $B = \{b_0, \dots, b_{N-1}\}$, in which each bank in B may
borrow money from or lend money to other banks in B. This structure is encoded in the
so-called liability matrix L, an N x N weighted, directed adjacency matrix describing the
instantaneous state of a credit network. Each element $L_{ij}$ denotes the funds that bank
$j \in B$ borrowed from bank $i \in B$ (regardless of the maturity of the debt). We fix the
convention that $L_{ij} \ge 0 \;\forall (i,j) \in B \times B$ and $L_{ii} = 0 \;\forall i \in B$.
With this definition, the expression $L_i^{\rightarrow} = \sum_j L_{ij}$ represents the total
credit which the institution i possesses against the system (also known as out-strength),
while $L_j^{\leftarrow} = \sum_i L_{ij}$ represents the total debt owed by the institution j to
the environment (in-strength). 1 This matrix contains information about the instantaneous
state of a credit network, and it is sufficient to estimate the risk of contagion in many
cases of practical relevance. However, one is often unable to obtain the complete expression
for the matrix L from empirical data. Data are typically extracted from banks' balance sheets
or from institutional databases [7], and partial information has to be coherently integrated
into a list of plausible liability matrices. In the following discussion, we will suppose that
three different types of information about L are available, as typically reported in the
literature [8]:



1 Without loss of generality we consider a closed economy ($\sum_i L_i^{\rightarrow} = \sum_j L_j^{\leftarrow}$),
by using bank $b_0$ as a placeholder to take into account flows of money external to the system.
(1) All the debts larger than a certain threshold $\theta$ are known. This allows us to
rescale all the elements of L by $\theta$, so that we consider without loss of generality
liability matrices for which all the unknown elements are bound to be in the interval [0,1].
We assume that at most order N elements exceed this threshold.

(2) We assume a certain set of entries (which we take to be of order N) to be known.
This corresponds to banks or bank sectors for which some particular position needs
to be disclosed by law.

(3) The total credit $L_i^{\rightarrow}$ and the total debit $L_i^{\leftarrow}$ of each bank are
known. Acceptable candidates for liability matrices need to satisfy a set of 2N linear
constraints, whose rank is in general $\mathcal{R} \le 2N - 1$ (due to the closed economy
condition).

We remark that we have defined a set of constraints of order N elements, which is too
small to single out a unique candidate for the true unknown liability matrix. The possible
solutions compatible with the observations define a space $\Lambda$, whose members we
denote with L. Let U be the set of entries of the liability matrix which are not directly
known (i.e. not fixed by constraints of type (1) and (2)). These entries (whose number is
M = |U|) are real numbers subject to domain constraints (they must be in [0, 1]) and linear
algebraic constraints (the sums over the rows and over the columns must be respected).
The ratio $M/\mathcal{R} \ge 1$ controls the degree of underdetermination of the network,
and is typically much larger than one.



3. Dense reconstruction 

A possible procedure to study the robustness of a financial network when the liability
matrix is not uniquely specified is to pick a representative matrix from the set of
candidate matrices $\Lambda$, and to test the stability only for the network specified by
such L. In this case a criterion has to be chosen to select a particular matrix out of the
space $\Lambda$, by making some assumptions about the structure of the true $L_{ij}$. A
choice which is commonly adopted [2, 3, 4, 5] is based on the maximum entropy criterion,
which assumes that banks spread their lending as evenly as possible. The problem becomes
in this case that of finding a vector $L = \{L_\alpha\}_{\alpha \in U}$ (the unknown entries
of the liability matrix) whose entries satisfy the algebraic and domain constraints and
minimize the distance from the uniform vector $Q = \{Q_\alpha\}_{\alpha \in U}$ (such that
$Q_\alpha = 1 \;\forall \alpha$), where the distance is quantified by the Kullback-Leibler
divergence

$$D_{KL}(L, Q) = \sum_{\alpha \in U} L_\alpha \log \frac{L_\alpha}{Q_\alpha} .$$

The minimization of such a function is a standard convex optimization problem, which can
be solved efficiently in polynomial time. In the financial literature this algorithm is known
as Maximal Entropy (ME) reconstruction. We remark that with this algorithm no entry is
set exactly to zero unless this is forced by the algebraic constraints. 2
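
As an illustration of the ME step, the following Python sketch fills the unknown entries by iterative proportional fitting, which converges to the minimizer of the Kullback-Leibler divergence from a uniform prior when only the row and column sums are imposed. It is a minimal sketch under stated assumptions and not the procedure used to produce the results of this paper: the function name max_entropy_fill, the boolean mask of unknown entries and the stopping rule are hypothetical, the known entries are assumed to have been subtracted from the strengths beforehand, and the domain constraint that unknown entries be at most 1 is not enforced.

```python
import numpy as np

def max_entropy_fill(row_sums, col_sums, mask, n_iter=500, tol=1e-10):
    """Fill the unknown entries (mask == True) by iterative proportional fitting.

    Hypothetical helper: only the row/column sum constraints are imposed;
    the upper bound L_ij <= 1 of the text is NOT enforced.
    """
    row_sums = np.asarray(row_sums, dtype=float)
    col_sums = np.asarray(col_sums, dtype=float)
    L = np.asarray(mask).astype(float)  # uniform prior Q = 1 on unknown entries
    for _ in range(n_iter):
        r = L.sum(axis=1)
        # rescale every row so that it reproduces the prescribed out-strength
        L *= np.divide(row_sums, r, out=np.zeros_like(r), where=r > 0)[:, None]
        c = L.sum(axis=0)
        # rescale every column so that it reproduces the prescribed in-strength
        L *= np.divide(col_sums, c, out=np.zeros_like(c), where=c > 0)[None, :]
        if np.abs(L.sum(axis=1) - row_sums).max() < tol:
            break
    return L
```

Entries outside the mask stay exactly zero, while every entry inside it remains strictly positive, which reproduces the dense structure discussed above.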



4. Sparse reconstruction 

ME might not be a particularly good description of reality, since the number of
counterparties of a bank is expected to be limited and much smaller than N, while ME tends
to produce completely connected structures. In the case of real networks the degree of
market concentration can be higher than suggested by ME. This systematically leads to
an underestimation of risk, as a structure in which the debt is distributed homogeneously
among the nodes is generally known to be able to absorb shocks more effectively than a
system in which few nodes dominate the network [6]. In order to be closer to reality and
to estimate the risk of contagion more accurately it is then necessary to reconstruct
liability matrices whose degree of sparsity (i.e. the fraction of zero entries of L) can be
tuned, and possibly taken to be as large as possible. This corresponds to the choice of
topologies for the interbank networks in which the number of links can be explicitly
regulated by means of a control parameter. We present in this section an algorithm which,
given the fraction $\lambda$ of entries which are expected to be exactly zero, is able to
reconstruct a sample of network structures compatible with this requirement, and to find a
$\lambda_{max}$ which bounds the maximum possible degree of sparsity. We focus the
discussion on the generic case in which topological properties of the original credit network
such as the sparsity parameter $\lambda$ or the number of counterparties of each bank are
not known, without imposing any specific type of null model. The purpose of the algorithm
is to provide an efficient means to explore the space $\Lambda$, and to illustrate how the
result of the stress-testing procedures may vary according to the density of zeroes of the
matrix L which is assumed.



To be more specific, let us define the notion of support of a liability matrix as follows:
given an N x N weighted, directed adjacency matrix L, we define its support
$a \in \{0, 1\}^{N \times N}$ as the N x N adjacency matrix such that

$$a_{ij}(L) = \begin{cases} 1 & \text{if } L_{ij} > 0 \\ 0 & \text{otherwise} \end{cases}$$

The sparsity $\lambda$ associated with a specific network structure a is defined as
$\lambda\{a_{ij}\} = 1 - \sum_{ij} a_{ij} / N(N-1)$. Finally, given a network structure a and
a set of liability matrices $\Lambda$, we say that a is compatible with $\Lambda$ if there
exists at least one matrix $L \in \Lambda$ such that $a_{ij}(L) = a_{ij}$. Now we consider a
liability matrix L which is partially unknown in the sense of section 2, and address the
following issues: (i) is it possible to fix a fraction $\lambda$ of the unknown entries to
zero without violating the domain and the algebraic constraints? More


2 This algorithm is not the only possible choice for extracting a representative matrix from the set $\Lambda$.
However, existing algorithms share with ME the property of returning solutions located in the interior of
$\Lambda$. On the other hand, when choosing a point at random in a compact set in very high dimension d, it
is very likely that the point will be very close to the boundary (i.e. at a distance of order 1/d). Hence, it is
reasonable to expect that typical feasible liability matrices are located on or close to the boundaries of $\Lambda$.
Figure 1. Entropy S of the space of compatible configurations $a_{ij}$ at fixed
sparsity $\lambda$ with the energy $H\{a_{ij}\}$ (+ sign) and true energy $H_0\{a_{ij}\}$ (x sign)
for the examples discussed in the text. S is defined as the logarithm of the
number of configurations $\{a_{ij}\}$ with $H = 0$ (or $H_0 = 0$), divided by the number
M of possibly non-zero entries $a_{ij}$. The solid line plotted for comparison is
the entropy of a system of independent links $a_{ij}$ with the same density (i.e.
number of non-zero links). The probability for a solution of $H\{a_{ij}\}$ to be
also a solution of $H_0\{a_{ij}\}$ is also plotted on the same graph (dashed line).

formally, this corresponds to asking whether there exists a matrix $L \in \Lambda$ such that
$\lambda = \lambda(a(L))$. More generally, (ii) how many supports a with fixed sparsity
$\lambda$ are compatible with $\Lambda$? The algorithm solves these problems by sampling
from the space of all compatible supports $a(\Lambda)$ potential candidates whose degree of
sparsity is constrained to be $\lambda$, and by evaluating the volume of such support
sub-space. As one can easily expect, there will be a range $[\lambda_{min}, \lambda_{max}]$
of fractions of fixed zeros compatible with the constraints: trivially $\lambda_{min} = 0$
corresponds to the dense network, which always admits a compatible solution, but we are
able to find a non-trivial $\lambda_{max}$ which corresponds to the maximally sparse
network of banks. A plot of the logarithm of the number of possible supports as a function
of $\lambda$ is given in figure 1 (x signs) for a network such as the ones described in
section 6. Once a support is given, the liability matrix elements can easily be
reconstructed via ME. 3
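
The notions of support and sparsity defined in this section can be made concrete with a short Python sketch. The function names support and sparsity are hypothetical and introduced only for illustration; the sparsity is computed over the N(N-1) off-diagonal entries, as in the definition above.

```python
import numpy as np

def support(L):
    """Binary support a_ij(L): 1 where L_ij > 0 and 0 otherwise."""
    return (np.asarray(L) > 0).astype(int)

def sparsity(a):
    """Fraction of absent links among the N(N-1) off-diagonal entries."""
    a = np.asarray(a)
    n = a.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)  # the diagonal is excluded by convention
    return 1.0 - a[off_diagonal].sum() / (n * (n - 1))
```

For a 3 x 3 liability matrix with three non-zero off-diagonal entries, the support has three links out of six possible ones, so the sparsity is 0.5.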



3 As shown in figures 2 and 3, ME tends to underestimate the risk of contagion (see footnote 2) even in
the case in which the true support $a(L)$ is known, thus suggesting that other reconstruction algorithms
should be employed for the estimation of the non-zero entries of a partially known liability matrix.
Nevertheless, it is clear from those same simulations that inferring the support corresponding to the original
network is a significant first step towards a more correct estimation of the risk of contagion.
The algorithm that we use to sample the candidate network structures a employs a
message-passing technique which avoids explicitly inspecting the compatibility of each
network. The main idea is to associate to each adjacency matrix a a sampling probability
$P_0\{a_{ij}\}$ that is strictly zero for non-compatible supports and is otherwise finite.
Sampling uniformly from the space of compatible supports would correspond to the choice
$P_0\{a_{ij}\} = 1/|a(\Lambda)|$ if $a \in a(\Lambda)$ and zero otherwise. To fix the required
degree of sparsity $\lambda$ of the network one can instead consider the modified
sampling probability

$$P_0\{a_{ij}\} = \begin{cases} \frac{1}{Z} z^{\sum_{ij} a_{ij}} & \text{if } a \in a(\Lambda) \\ 0 & \text{otherwise} \end{cases}$$

where Z is a normalization constant and the fugacity z controls the average degree of
sparsity of the sampled network, and is fixed in order to recover
$\lambda = \sum_a P_0\{a_{ij}\} \, \lambda\{a_{ij}\}$. The variable log z is analogous to a
chemical potential in physics, in the sense that it is used to select denser or sparser
sub-graphs (i.e. tuning the $\lambda$ parameter). The probability distribution
$P_0\{a_{ij}\}$ can also be seen as the $\beta \to \infty$ limit of

(1) $P_0\{a_{ij}\} = \frac{1}{Z} e^{-\beta H_0\{a_{ij}\}} z^{\sum_{ij} a_{ij}}$

where we introduce the formal cost function $H_0\{a_{ij}\}$ which vanishes for
$a \in a(\Lambda)$ and is 1 otherwise. Probability distributions of the form (1) are
typically hard to compute explicitly due to the presence of the normalization constant Z,
but their approximate marginals can be estimated efficiently (i.e. within a time scaling
linearly in the number of unknown variables M) by means of iterative algorithms such as the
one described in the appendix. The solution that one obtains for the marginals

(2) $p_{ij} = \sum_a P_0\{a_{ij}\} \, \delta(a_{ij} = 1)$

corresponds to the probability that the entry $a_{ij}$ is equal to one in the ensemble of
network structures which (i) are compatible with $\Lambda$ and (ii) have an average degree
of sparsity $\lambda$. Being able to compute those marginals allows one to sample
efficiently the space of solutions by employing procedures such as the decimation one
described in the appendix, in which at each step the most biased variable is fixed to
$a_{ij} = 1$ with probability $p_{ij}$, and a reduced problem in which such $a_{ij}$ is held
fixed is subsequently solved. Once all variables are fixed, an adjacency matrix $a_{ij}$ is
selected out of the space of solutions and can be used as a candidate network structure.
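
The decimation idea can be illustrated on a toy problem with the Python sketch below. It is not the message-passing procedure of the appendix: as a stated assumption, the marginals are computed exactly by brute-force enumeration, which is only viable for a handful of variables, and the names exact_marginals, decimate and weight are hypothetical. The function weight plays the role of the unnormalized sampling probability, returning zero for forbidden configurations.

```python
import itertools
import numpy as np

def exact_marginals(weight, n_vars, fixed):
    """P(a_k = 1) for each free variable, by enumeration (tiny problems only)."""
    free = [k for k in range(n_vars) if k not in fixed]
    total, ones = 0.0, np.zeros(n_vars)
    for values in itertools.product((0, 1), repeat=len(free)):
        a = np.zeros(n_vars, dtype=int)
        for k, v in fixed.items():
            a[k] = v
        a[free] = values
        w = weight(a)
        total += w
        ones += w * a
    if total == 0:
        raise ValueError("no configuration with non-zero weight")
    return {k: ones[k] / total for k in free}

def decimate(weight, n_vars, rng):
    """Fix one variable at a time, always choosing the most biased one."""
    fixed = {}
    while len(fixed) < n_vars:
        p = exact_marginals(weight, n_vars, fixed)
        k = max(p, key=lambda j: abs(p[j] - 0.5))  # most biased variable
        fixed[k] = int(rng.random() < p[k])        # set to 1 with probability p_k
        # the reduced problem with a_k held fixed is solved at the next pass
    return np.array([fixed[k] for k in range(n_vars)])
```

Because each variable is drawn from its exact conditional marginal, the configuration returned in this toy setting always has non-zero weight; with approximate marginals this is no longer guaranteed.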

Unfortunately, the energy function $H_0$ is hard to manipulate, and we need to resort to
an approximate energy function $H$, whose structure is derived in the following paragraph.
Suppose that a liability matrix with unknown entries is given, together with the vector
of total credit ($L_i^{\rightarrow}$) and that of total liabilities ($L_i^{\leftarrow}$). Then
without loss of generality one can assume the known entries to be equal to zero, as the
values of the known entries can always be absorbed into a rescaled value of the
$L_i^{\rightarrow}$ and $L_i^{\leftarrow}$, and the problem can be restricted just to the
unknown entries of the matrix. Under this assumption we can define the set of banks
$B' \subseteq B$ which are linked to the unknown entries of the liability matrix. Each node
of $B'$ is a bank and the directed edges are the elements of U. For each node i of $B'$ the
sum of the outgoing entries $L_i^{\rightarrow} = \sum_j L_{ij}$ and of the incoming entries
$L_i^{\leftarrow} = \sum_j L_{ji}$ is known. Let $k_i^{\rightarrow}$ ($k_i^{\leftarrow}$) be the
number of outgoing (incoming) links in the subset of edges where $L_{ij} > 0$. Since
$L_{ij} < 1$, the number $k_i^{\rightarrow}$ ($k_i^{\leftarrow}$) of outgoing (incoming) links
is at least the integer part of $L_i^{\rightarrow}$ ($L_i^{\leftarrow}$) plus one. Therefore,
one can define a cost function 4

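(3) $H\{a_{ij}\} = \sum_i \left[ \theta(L_i^{\rightarrow} - k_i^{\rightarrow}) + \theta(L_i^{\leftarrow} - k_i^{\leftarrow}) \right]$

over the dynamical variables $a_{ij} = 0, 1$ which identify the subset of edges, with

$k_i^{\rightarrow} = \sum_j a_{ij}, \qquad k_i^{\leftarrow} = \sum_j a_{ji} .$

Then we can construct the probability function

(4) $P\{a_{ij}\} = \frac{1}{Z} e^{-\beta H\{a_{ij}\}} z^{\sum_{ij} a_{ij}}$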

which we employ to sample the space of candidate network structures. Notice that all
sub-graphs $a_{ij}$ with $H = 0$ are feasible candidates for the support of solutions
$L_{ij} > 0$ to the problem. In general, the constraints are 2N linear equations and, as long
as the number of non-zero elements $L_{ij}$ is larger than 2N, solutions exist, but it is not
guaranteed that they have $L_{ij} \in [0, 1]$ for all i, j. In other words, all the compatible
solutions have to satisfy the constraint $H = 0$, but the converse is not true (as shown in
figure 1), because some supports $a_{ij}$ may not admit a solution with $L_{ij} \in [0, 1]$
for all i, j. Equivalently, the cost function $H_0\{a_{ij}\}$ involves constraints that the
approximate $H$ is not able to capture. Message passing algorithms can be derived along the
lines of Refs. [9, 10] to solve efficiently the problem of sampling the space of solutions of
(3), as described in detail in the appendix. In particular we propose a generalization of the
algorithm employed in Ref. [9], in which we consider hard constraints enforced by
inequalities rather than equalities and add a fugacity parameter z in order to control the
density of links of the solutions.
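
The degree-based cost function of this section can be evaluated directly, as in the Python sketch below. The names degree_energy and zero_energy_supports are hypothetical; the step function follows the convention that it equals 1 at zero, so the sketch assumes that all strengths are strictly positive (a bank with zero strength and no links would otherwise be counted as a violation). The enumeration of zero-energy supports is a brute-force stand-in for the message-passing sampler and is only feasible for very small N.

```python
import itertools
import numpy as np

def degree_energy(a, out_strength, in_strength):
    """Number of nodes whose out- or in-degree cannot carry their strength.

    A node contributes when its strength minus its degree is >= 0, i.e. when
    it has too few links for entries that are each smaller than 1.
    """
    a = np.asarray(a)
    k_out = a.sum(axis=1)  # links used to lend
    k_in = a.sum(axis=0)   # links used to borrow
    violated_out = np.asarray(out_strength) - k_out >= 0
    violated_in = np.asarray(in_strength) - k_in >= 0
    return int(violated_out.sum() + violated_in.sum())

def zero_energy_supports(out_strength, in_strength):
    """All off-diagonal supports with zero energy (exhaustive, tiny N only)."""
    n = len(out_strength)
    edges = [(i, j) for i in range(n) for j in range(n) if i != j]
    found = []
    for bits in itertools.product((0, 1), repeat=len(edges)):
        a = np.zeros((n, n), dtype=int)
        for (i, j), b in zip(edges, bits):
            a[i, j] = b
        if degree_energy(a, out_strength, in_strength) == 0:
            found.append(a)
    return found
```

Supports returned by this enumeration are only candidates: as noted above, a vanishing energy does not guarantee that weights in [0, 1] satisfying the row and column sums exist.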

5. Furfine stress-test

The aim of this section is to show that some measures of the vulnerability of a banking
system to financial contagion, also known as stress-tests, are sensitive to the way in which
the liability matrix is reconstructed. In particular the dense ME reconstruction typically
underestimates the risk of contagion, while more realistic results are found if one employs
a sparsification parameter $\lambda$ controlling the density of links in a financial
system.

A widely used measure of vulnerability in the financial literature is the stress-test
introduced by Furfine [11], which is a sequential algorithm to simulate contagion. Suppose
that the



4 Here $\theta(x) = 0$ for $x < 0$ and $\theta(x) = 1$ otherwise is the Heaviside step function.
liability matrix L is given and let us denote by $C_z$ the initial capital of a bank z in the
system B. The idea of the algorithm is simple: suppose that a bank z of the ensemble B fails
due to exogenous reasons. Then it is assumed that any bank $i \in B$ loses a quantity of
money equal to its exposure to z ($L_{iz}$) multiplied by an exogenously given parameter
$\alpha \in [0, 1]$ for loss-given-default. If the loss of bank i exceeds its capital $C_i$,
bank i fails. This procedure is then iterated until no more banks fail, and the total number
of defaults is recorded.

The procedure described above can be formally rephrased in the following steps: 

Step 0: A bank $z \in B$ fails for external reasons. Let us define $D_0 = \{z\}$,
$S_0 = B \setminus \{z\}$. For the banks $i \in S_0$ we set $C_i^0 = C_i$.

Step t: The capital $C_i^{t-1}$ at step t - 1 of banks $i \in S_{t-1}$ is updated according to

$C_i^t = C_i^{t-1} - \alpha \sum_{j \in D_{t-1}} L_{ij}$

with $\alpha \in [0,1]$. A bank $i \in S_{t-1}$ fails at time t if $C_i^t < 0$. Let us define
$D_t$, the ensemble of all the banks $i \in S_{t-1}$ that failed at time t, and
$S_t = S_{t-1} \setminus D_t$, the ensemble of banks that survived at step t.

Step $t_{stop}$: The algorithm stops at time $t_{stop}$ such that $D_{t_{stop}} = \emptyset$.

We remark that the capital $C_i$ of each bank is exogenously given, and in principle it is
not linked to the liability matrix L. The same holds for $\alpha$, so that the result of a
stress-test is understood as a curve quantifying the number of defaults as a function of the
parameter $\alpha$. Finally, the results of the stress-test depend on the first bank
$z \in B$ which defaults. One may then choose either to consider the results of the
stress-test as dependent on the z which has been chosen, or to average the outcome over all
the banks in the system B; we adopt this second type of measure, and consider the default
of each bank to be equally likely.
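
The sequential procedure described in this section can be sketched in a few lines of Python. The function names furfine_defaults and mean_default_fraction are hypothetical, and two choices are assumptions of this sketch rather than statements of the text: the initially failed bank is counted among the defaults, and the average over the first defaulting bank is normalized by N.

```python
import numpy as np

def furfine_defaults(L, capital, first, alpha):
    """Number of failed banks when bank `first` defaults exogenously.

    L[i, j] is the amount that bank j borrowed from bank i, i.e. the
    exposure of i to j; alpha is the loss-given-default.
    """
    L = np.asarray(L, dtype=float)
    cap = np.asarray(capital, dtype=float).copy()
    failed = np.zeros(L.shape[0], dtype=bool)
    newly = np.zeros(L.shape[0], dtype=bool)
    failed[first] = newly[first] = True
    while newly.any():
        # every bank loses alpha times its exposure to the banks that just failed
        cap -= alpha * L[:, newly].sum(axis=1)
        newly = (~failed) & (cap < 0)  # surviving banks whose capital is exhausted
        failed |= newly
    return int(failed.sum())

def mean_default_fraction(L, capital, alpha):
    """Average fraction of failed banks over the choice of the first default."""
    n = len(capital)
    return np.mean([furfine_defaults(L, capital, z, alpha) for z in range(n)]) / n
```

Scanning alpha over [0, 1] with mean_default_fraction yields the kind of curve that is used to compare the true and the reconstructed liability matrices.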

6. Application to synthetic data 

In this section we show that our algorithm for the reconstruction of the liability matrix
$L_{ij}$ (presented in section 4) gives more realistic stress-test results than the ME
reconstruction algorithm (presented in section 3).

We choose to present the results obtained for specific ensembles of artificial matrices, whose
structure should capture some of the relevant features of real credit networks 5. The first
case that we analyze is the simplest possible network with a non-trivial topology, namely
5 Our attempts to obtain data on real financial networks, such as those in Refs. [6, 7], from central
banks were unsuccessful. We focus on ensembles of homogeneous networks (i.e. non-scale-free). This is
appropriate since the unknown part of the financial network concerns small liabilities, and there is no a
priori reason to assume a particularly skewed distribution of degrees for the unknown part of the financial
network.
the one in which every entry of the liability matrix $L_{ij}$ with $i \ne j$ is set to zero
with probability $\lambda$, and otherwise is a random number uniformly chosen in [0, 1]. We
set the banks' initial capital $C_i$ to random numbers uniformly chosen in
$[C_{min}, C_{max}]$. We impose the threshold $\theta = 1$, which means that all the entries
of the liability matrix are unknown (a worst-case scenario). We then reconstruct the
liability matrix via the ME algorithm and via our algorithm, trying to fix the fraction of
zeroes to be equal to $\lambda$. Then we stress-test via the Furfine algorithm the three
liability matrices: the true one, the one reconstructed via the ME algorithm and the one
reconstructed by means of our message-passing algorithm, varying the loss-given-default
$\alpha$ in [0, 1]. The results of our simulations are shown in figure 2. They clearly show
that the ME algorithm underestimates the risk of contagion, while more realistic results
are obtained if the original degree of sparsity $\lambda$ is assumed.

Notice that even when the degree of sparsity is correctly estimated, stress tests on the
reconstructed matrix still underestimate systemic risk. This is because the weights
$L_\alpha$ on the reconstructed sub-graph are again assigned using the ME algorithm. This
by itself produces an assignment of weights which is much more uniform than a random
assignment of weights on the sub-graph which satisfies the constraints (see footnote 2). As
a result, the propagation of risk is much reduced in the ME solution.

The second ensemble that we consider is a simple extension of the first one, in which
the only modification that we have introduced implements heterogeneity in the size of the
liabilities $L_{ij}$. In particular we consider matrix elements distributed according to

$p(L_{ij}) \sim (b + L_{ij})^{-\mu - 1} .$

Also in this case we can show (figure 3) that a more accurate estimation of the default
probability is achieved by enforcing the sparsity parameter of the reconstructed network to
be the correct one. In this case the maximally sparse curve is less informative than in the
uniform case. This is easily understood as due to the fact that the typical element
$L_{ij} \sim 10^{-2}$ is much smaller than the threshold $\theta = 1$, so that a number of
zero entries substantially larger than the original one can be fixed without violating the
hard constraints.
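
The two synthetic ensembles of this section can be generated along the following lines. This Python sketch rests on assumptions that the text does not fix: the function names are hypothetical, the power-law density is taken on the whole half-line of non-negative values (so that a few entries may exceed 1, in which case they would be among the known entries above the threshold), and the weights are drawn by inverting the cumulative distribution 1 - (b / (b + x))^mu.

```python
import numpy as np

def uniform_ensemble(n, lam, rng):
    """Off-diagonal entries are 0 with probability lam, else uniform in [0, 1)."""
    L = rng.random((n, n))
    L[rng.random((n, n)) < lam] = 0.0
    np.fill_diagonal(L, 0.0)
    return L

def power_law_weights(size, b, mu, rng):
    """Inverse-transform samples from p(x) ~ (b + x)^(-mu - 1) on x >= 0."""
    u = 1.0 - rng.random(size)              # uniform in (0, 1]
    return b * (u ** (-1.0 / mu) - 1.0)     # solves (b / (b + x))^mu = u

def power_law_ensemble(n, lam, b, mu, rng):
    """Same link structure as the uniform case, with heavy-tailed weights."""
    L = power_law_weights((n, n), b, mu, rng)
    L[rng.random((n, n)) < lam] = 0.0
    np.fill_diagonal(L, 0.0)
    return L
```

With b = 0.01 and mu = 2 the median weight is b (sqrt(2) - 1), of order 10^-2 or smaller, in line with the typical element size quoted above.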

In both cases, when the true sparsity of the network is unknown, focusing on the sparsest 
possible graph likely over-estimates systemic cascades, thereby providing a more conserva- 
tive measure for systemic risk than the one obtained by employing ME alone. 

7. The role of the threshold 

In the discussion above we disregarded the role of the threshold $\theta$ above which an
exposure $L_{ij}$ has to be made publicly available to regulators, by setting it equal to 1.
However, setting such a threshold is a central problem in building a regulatory policy,
hence the reliability of the reconstruction algorithm as $\theta$ varies while the true L
is kept fixed is particularly relevant in practice. An appropriate way to address this
issue is the following: given a network ensemble (such as the ones described in the previous
section) and a threshold $\theta$, how many network structures are there with a compatible
support? In particular, we remark that among all such compatible supports the maximally
sparse one can be used to bound from above the maximum amount of risk given a policy
Figure 2. Plot of the mean fraction of failed banks vs the loss-given-default
parameter $\alpha$. The mean is computed by averaging over the defaulting bank which
starts the contagion. Results are obtained by considering: the true liability
matrix (solid line), the liability matrix reconstructed via the ME algorithm (thick
dashed line) and the maximally sparse matrix (+ signs). Plots were obtained
for a network of N = 50 banks with entries uniform in [0,1], where the link
probability was fixed to 0.7 and the initial capital was set to $C_i = C = 0.3$.
One can easily see that a better estimation of the true risk of contagion is
obtained if the reconstruction of the liability matrix is done by enforcing
the correct sparsity of the network rather than with the ME algorithm: the
results obtained by using the correct support (soft dashed line), corresponding
to the original network structure $a(L_{ij})$, are also plotted, as well
as the ones obtained by using a typical support (x signs), corresponding
to the choice of a random, compatible support $a_{ij}$ whose degree of sparsity
matches the one of the original network. Error bars refer to the fluctuations
of the default ratio associated with the choice of a specific support out
of the ensemble of compatible ones at fixed degree of sparsity.



for the thresholding. In particular, for each value of $\theta$ we empirically find that
$\lambda_{max}(\theta)$ enjoys the following properties:

(1) The maximum sparsity $\lambda_{max}(\theta)$ decreases as the threshold $\theta$ is
lowered. In particular for $\theta \to 0$ one has $\lambda_{max}(\theta) \to \lambda$;

(2) The entropy $S(\lambda(\theta)) \to 0$ when the threshold goes to 0.

An example of this behavior for an ensemble of networks with power-law distributed weights
is represented in figure 4, while in figure 5 we plot the entropy $S(\lambda_{max})$ of the
space of compatible structures as a function of $\theta$. Therefore the algorithm described
in section 4 provides quantitative measures for
Figure 3. A plot analogous to the one in figure 2 for the case of power-law
distributed entries of the liability matrix. This plot was obtained for a
network of size N = 50, where the link probability was fixed to 1/2. The
parameters for the distribution of the entries were set to b = 0.01 and $\mu = 2$,
while the capital of each bank was fixed to $C_i = C = 0.02$.

the uncertainty induced by the choice of a given threshold on network reconstruction.
Ideally $\theta$ should be chosen so that maximally sparse structures are close to the true
ones, and that the space of compatible structures is not too large (small entropy).

8. Conclusions 

We have shown how it is possible to estimate the robustness of a financial network to 
exogenous crashes by using partial information. We confirm [6] that systemic risk measures 
depend crucially on the topological properties of the underlying network, and we show 
that the number of links in a credit network controls in a critical manner its resilience: 
connected networks tend to absorb the response to external shocks more homogeneously 
than sparse ones. We have also proposed an efficient message-passing algorithm for the
reconstruction of the topology of partially unknown credit networks, in order to estimate
their robustness more accurately. This algorithm allows one (i) to sample the space of
possible network structures, which is assumed to be trivial in the Maximal Entropy
algorithms commonly employed for network reconstruction, and (ii) to produce typical credit
networks, respecting the topological constraint on the total number of links. Finally, we
test our algorithms on ensembles of synthetic credit networks which incorporate some of the
main features of real credit networks (sparsity and heterogeneity), and find that the
quality of the stress-test when only partial information is available critically depends on
the assumptions
Figure 4. We plot the entropy of the space of compatible distributions
(i.e. of the solutions of $H\{a_{ij}\} = 0$) as a function of the sparsity parameter
$\lambda$ by varying the threshold $\theta$ from 1 (top curve) to 0.01 (bottom curve).
The dashed line signals the transition point where solutions cease to exist.
We consider power-law distributed entries for the true network (D = 30,
$\lambda \simeq 0.3$, b = 0.01 and $\mu = 2$). This shows how the volume of the space
is reduced by a change of the threshold and how $\lambda_{max}$ gets closer to $\lambda$ by
lowering $\theta$.



Figure 5. The entropy of the space of solutions of $H\{a_{ij}\} = 0$ as a function of
the threshold $\theta$ for the same network as the one depicted in figure 4 (power-law case).
which are made about the network topology. In particular, we find that ME underestimates
the risk of contagion if the sparsity of the real ensemble is large enough, while our
algorithm provides less biased estimates. We remark that a worst case analysis of the
topology is possible using the proposed algorithm, as we are able to produce the maximally
sparse (hence, maximally fragile) possible structure for the network. Further developments
of this work are possible: in particular, the identification and the reconstruction of
other relevant topological features of credit networks would allow a more accurate
estimation of the contagion risk.
A similar problem of deriving network ensembles satisfying given constraints has recently been 
addressed in Ref. [13]. However, Ref. [13] focuses on a different problem, namely that of efficiently computing 
expected values of network properties in maximum entropy ensembles of networks with the same expected 
degree sequence as a given graph. Here we focus on the problem of deriving ensembles of partially observed 
networks. We mention, in passing, that ensemble properties, including local ones, can be computed very 
efficiently within our framework from the fixed point of the message-passing equations for the marginals 
$\mu_{a\to b}$, in spite of the fact that our constraints are more complex, as they involve inequalities in the degrees. 

Appendix A. Message-passing algorithm 

We describe here the algorithm which we use to sample the solution space of the energy 
function 

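$$\mathcal{H}\{a_{ij}\} = \sum_i \left[ \theta\!\left(L_i^{\rightarrow} - k_i^{\rightarrow}\right) + \theta\!\left(L_i^{\leftarrow} - k_i^{\leftarrow}\right) \right]$$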

which we derived along the lines of [9]. Specifically, given as input an incomplete liability 
matrix, whose information is encoded into a set of $N$ in-strengths $L_i^{\leftarrow}$, $N$ out-strengths $L_i^{\rightarrow}$ 
and a set $U$ of unknown entries of cardinality $M = |U|$, we provide an algorithm which, for 
any positive value of the fugacity $z$, returns an adjacency matrix $a_{ij}$ sampled according 
to the probability distribution (4) 

$$P\{a_{ij}\} = \frac{1}{Z}\, e^{-\beta \mathcal{H}\{a_{ij}\}}\, z^{\sum_{i,j} a_{ij}} .$$

The procedure that we describe can be separated into two main tasks: (i) given a 
probability distribution of the form (4), finding an efficient means to calculate the marginals 
$p_{ij}$ defined analogously to (2), and (ii) given a fast algorithm to calculate the marginals, using 
them to find an adjacency matrix $a_{ij}$ distributed according to $P\{a_{ij}\}$. 

A.1. Calculation of the marginals. The structure of the problem admits a graphical 
representation as a factor graph, in which $|U|$ variable nodes are associated with the $a_{ij}$ 
degrees of freedom, while the constraints are represented as factor nodes. In particular, 
there are $2N$ function nodes, labeled $a \in \{i^{\rightarrow}, i^{\leftarrow};\ i = 1, \dots, N\}$, each with $k_a$ variable 
nodes attached. Let the variables be denoted $x_{a,b} = x_{b,a} \in \{0, 1\}$, where $a$ and $b$ are the two 
function nodes attached to the variable, and let $\partial a$ be the 
set of neighbors of node $a$. Let $M = \frac{1}{2} \sum_a |\partial a|$ be the total number of variables. For each 
variable $x_{a,b}$ we define the message $\mu_{a\to b}$ as the reduced marginal 

$$\mu_{a\to b} = \sum_{x} P\{x_{a,b} \mid \not b\}\, \delta(x_{a,b} = 1) ,$$

where $P\{x_{a,b} \mid \not b\}$ denotes the restriction of the probability measure (4) to a problem in 
which the function node $b$ is absent. Such messages need to fulfill self-consistent relations 
(BP equations) [12], which can be written in terms of the statistical weights¹ 

$$V^{m}_{S\to a} = \sum_{u \subseteq S:\, |u| = m}\ \prod_{b \in u} \mu_{b\to a} \prod_{c \in S \setminus u} \left(1 - \mu_{c\to a}\right)$$



¹ Since $k_a$ can be as large as $N$, the direct computation of $V^{m}_{S\to a}$ involves in principle $2^{k_a}$ terms, which 
may be very large. A faster way to compute it is to use the recursion relation 

$$V^{m}_{S\to a} = (1 - \mu_{b\to a})\, V^{m}_{S\setminus b\to a} + \mu_{b\to a}\, V^{m-1}_{S\setminus b\to a}, \qquad \forall b \in S .$$

In practice this allows one to build $V^{m}_{S\to a}$ by adding the nodes in $S$ one at a time. This procedure involves of 
order $m^2 < k_a^2$ operations. 




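and they read 

$$\mu_{a\to b} = \frac{\sum_{m=L_a-1}^{k_a-1} z^{m+1}\, V^{m}_{\partial a\setminus b\to a}}{\sum_{m=L_a-1}^{k_a-1} z^{m+1}\, V^{m}_{\partial a\setminus b\to a} + \sum_{m=L_a}^{k_a-1} z^{m}\, V^{m}_{\partial a\setminus b\to a}} = \frac{V^{L_a-1}_{\partial a\setminus b\to a} + z\, W_{\partial a\setminus b\to a}}{V^{L_a-1}_{\partial a\setminus b\to a} + (1 + z)\, W_{\partial a\setminus b\to a}} \tag{5}$$

$$W_{\partial a\setminus b\to a} = \sum_{m=L_a}^{k_a-1} z^{m-L_a}\, V^{m}_{\partial a\setminus b\to a} \tag{6}$$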

Here $z$ is the fugacity of links, and controls the average degree of sparsity $\lambda$ of the supports 
in the solution space. For $z \to 0$ we obtain the equation for the sparsest possible graph 

$$\mu_{a\to b} = \frac{V^{L_a-1}_{\partial a\setminus b\to a}}{V^{L_a-1}_{\partial a\setminus b\to a} + V^{L_a}_{\partial a\setminus b\to a}} ,$$

whereas for $z \to \infty$ we recover the maximally connected graph, $\mu_{a\to b} = 1$ for all $a$ and 
$b \in \partial a$. 
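
A single message update can be sketched in a few lines. The block below is a minimal illustration, not the authors' implementation: it assumes the update has the form $\mu_{a\to b} = (V^{L_a-1} + zW)/(V^{L_a-1} + (1+z)W)$ with $W = \sum_{m \ge L_a} z^{m-L_a} V^m$, it treats $L_a$ as a non-negative integer lower bound on the number of links, and the function names `count_weights` and `bp_message` are hypothetical. The weights $V^m$ are built with the one-node-at-a-time recursion from the footnote.

```python
from typing import List, Sequence


def count_weights(mu_in: Sequence[float]) -> List[float]:
    """Return V[m] for m = 0..len(mu_in): weight of exactly m links present.

    Uses the recursion V^m_S = (1 - mu_b) V^m_{S minus b} + mu_b V^{m-1}_{S minus b},
    adding one neighbour b at a time.
    """
    weights = [1.0]  # empty set: zero links, weight 1
    for mu in mu_in:
        updated = [0.0] * (len(weights) + 1)
        for m, v in enumerate(weights):
            updated[m] += (1.0 - mu) * v  # new neighbour's link absent
            updated[m + 1] += mu * v      # new neighbour's link present
        weights = updated
    return weights


def bp_message(mu_in: Sequence[float], min_links: int, z: float) -> float:
    """Message mu_{a->b} given the incoming messages mu_{c->a}, c in (da minus b).

    min_links stands for L_a (assumed here to be a non-negative integer),
    z is the fugacity of links.
    """
    v = count_weights(mu_in)  # v[m] for m = 0 .. k_a - 1
    v_low = v[min_links - 1] if 1 <= min_links <= len(v) else 0.0
    w = sum(z ** (m - min_links) * v[m] for m in range(min_links, len(v)))
    denom = v_low + (1.0 + z) * w
    if denom == 0.0:
        raise ValueError("constraint cannot be met by the remaining links")
    return (v_low + z * w) / denom
```

With three incoming messages equal to 0.5 and `min_links = 2`, setting `z = 0.0` reproduces the sparsest-graph limit $V^{L_a-1}/(V^{L_a-1} + V^{L_a})$, while a very large `z` drives the message towards 1, in line with the two limits discussed above.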

Once the fixed point of Eqs. (5,6) is found by iteration, for a given $z$, one can compute 
the marginals 

$$p_{ab} = \frac{\mu_{a\to b}\, \mu_{b\to a}}{\mu_{a\to b}\, \mu_{b\to a} + (1 - \mu_{a\to b})(1 - \mu_{b\to a})}$$

that link $(a, b)$ is present, and the entropy 

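$$S(z) = \sum_{a} \log \sum_{m=L_a}^{k_a} V^{m}_{\partial a\to a} - \frac{1}{2} \sum_{a} \sum_{b \in \partial a} \log\left[ \mu_{a\to b}\, \mu_{b\to a} + (1 - \mu_{a\to b})(1 - \mu_{b\to a}) \right]$$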
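
The link marginal given before the entropy only needs the two converged messages travelling along the link. A minimal sketch, assuming the marginal has the form $\mu_{a\to b}\mu_{b\to a} / [\mu_{a\to b}\mu_{b\to a} + (1-\mu_{a\to b})(1-\mu_{b\to a})]$ (the function name `link_marginal` is hypothetical):

```python
def link_marginal(mu_ab: float, mu_ba: float) -> float:
    """Probability that link (a, b) is present, from the two opposite messages."""
    present = mu_ab * mu_ba                  # both sides favour the link
    absent = (1.0 - mu_ab) * (1.0 - mu_ba)   # both sides favour its absence
    if present + absent == 0.0:
        # One message says "surely present", the other "surely absent".
        raise ValueError("contradictory messages")
    return present / (present + absent)
```

A message equal to 1 forces the marginal to 1 whatever the opposite message says (as long as it is not exactly 0), which is how hard constraints propagate to the link probabilities.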

To plot the number of solutions (or of different supports) as a function of the sparsity 
parameter $\lambda$, and the associated entropy $\Sigma(\lambda)$, one should use the fact that 

$$e^{M S(z)} = \int d\lambda\; e^{M \Sigma(\lambda) + M (1 - \lambda) \log z}$$

and hence perform the back-Legendre transform. 
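
The back-Legendre transform can be made explicit with a short saddle-point argument. This is a sketch under the assumption that $M$ is large enough for the integral over $\lambda$ to be dominated by its maximum, and that $\Sigma(\lambda)$ is smooth and concave; $\lambda^*(z)$ denotes the maximizer and is a symbol introduced here for illustration.

$$S(z) = \max_{\lambda} \left[ \Sigma(\lambda) + (1 - \lambda) \log z \right], \qquad \left.\frac{d\Sigma}{d\lambda}\right|_{\lambda^*(z)} = \log z ,$$

$$1 - \lambda^*(z) = \frac{dS}{d \log z}, \qquad \Sigma\big(\lambda^*(z)\big) = S(z) - \big(1 - \lambda^*(z)\big) \log z .$$

Under these assumptions, sweeping $z$, computing $S(z)$ at the BP fixed point and differentiating numerically with respect to $\log z$ gives the curve $\Sigma(\lambda)$ parametrically.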

A.2. Decimation. We describe in the following a decimation procedure to generate the 
configurations $a_{ij}$ once the problem of computing the marginals $p_{ij}$ is under control. For 
simplicity we present a simple version of the algorithm; a more detailed 
description of this procedure and a discussion of its efficient variants can be found in reference 
[12]. 

Step 0: Define the set $U^{(0)} = U$, and the in- and out-strengths $L_a^{(0)} = L_a$. The candidate 
network structure is defined as $a^{(0)}_{ij} = 0$ if $(i, j) \in U$ and $a^{(0)}_{ij} = \theta(L_{ij})$ otherwise. 

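Step $t + 1$: Find the marginals $p^{(t)}_{ij}$ of the probability distribution $P^{(t)}\{a_{ij}\}$ 
associated with the reduced problem defined by the incomplete matrix of unknown entries $U^{(t)}$ 
and in- and out-strengths $L_a^{(t)}$. Select the most biased variable 
$(i^*, j^*) = \arg\min_{(i,j) \in U^{(t)}} \min\left[ p^{(t)}_{ij},\, 1 - p^{(t)}_{ij} \right]$ and set: 

$$\begin{aligned}
a^{(t+1)}_{i^* j^*} &= 1 \quad \text{with prob. } p^{(t)}_{i^* j^*} \ \ (0 \text{ otherwise}), \\
U^{(t+1)} &= U^{(t)} \setminus (i^*, j^*), \\
L^{\rightarrow (t+1)}_{i^*} &= L^{\rightarrow (t)}_{i^*} - a^{(t+1)}_{i^* j^*}, \\
L^{\leftarrow (t+1)}_{j^*} &= L^{\leftarrow (t)}_{j^*} - a^{(t+1)}_{i^* j^*} .
\end{aligned}$$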



Step $t_{\mathrm{stop}}$: The algorithm stops at time $t_{\mathrm{stop}}$ such that $U^{(t_{\mathrm{stop}})} = \emptyset$. The candidate 
support $a_{ij} = a^{(t_{\mathrm{stop}})}_{ij}$ so obtained is distributed according to the probability distribution (4). 
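
The decimation loop can be illustrated on a toy problem. The sketch below is not the authors' code and makes several simplifying assumptions, all of them introduced here: marginals are computed by brute-force enumeration (`exact_marginals`, a stand-in for message passing that is only feasible for a handful of unknown entries), the constraints are expressed as integer minimum numbers of links per row (`need_out`) and per column (`need_in`), and each compatible support is weighted by `z` raised to its number of links.

```python
import itertools
import random
from typing import Dict, List, Tuple

Entry = Tuple[int, int]


def exact_marginals(unknown: List[Entry], need_out: Dict[int, int],
                    need_in: Dict[int, int], z: float) -> Dict[Entry, float]:
    """Brute-force link marginals (stand-in for BP, toy sizes only)."""
    weight_one = {e: 0.0 for e in unknown}
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(unknown)):
        out_deg: Dict[int, int] = {}
        in_deg: Dict[int, int] = {}
        for (i, j), x in zip(unknown, bits):
            out_deg[i] = out_deg.get(i, 0) + x
            in_deg[j] = in_deg.get(j, 0) + x
        rows_ok = all(out_deg.get(i, 0) >= n for i, n in need_out.items())
        cols_ok = all(in_deg.get(j, 0) >= n for j, n in need_in.items())
        if not (rows_ok and cols_ok):
            continue  # support violates a degree constraint
        w = z ** sum(bits)  # fugacity weight of this support
        total += w
        for e, x in zip(unknown, bits):
            if x:
                weight_one[e] += w
    if total == 0.0:
        raise ValueError("no compatible support exists")
    return {e: weight_one[e] / total for e in unknown}


def decimate(unknown: List[Entry], need_out: Dict[int, int],
             need_in: Dict[int, int], z: float,
             rng: random.Random) -> Dict[Entry, int]:
    """Fix one unknown entry per step, most biased first, until none is left."""
    unknown = list(unknown)
    need_out, need_in = dict(need_out), dict(need_in)
    support: Dict[Entry, int] = {}
    while unknown:
        p = exact_marginals(unknown, need_out, need_in, z)
        i, j = min(unknown, key=lambda e: min(p[e], 1.0 - p[e]))
        a = 1 if rng.random() < p[(i, j)] else 0
        support[(i, j)] = a
        unknown.remove((i, j))
        # A link that is set to 1 lowers the remaining requirement by one.
        need_out[i] = need_out.get(i, 0) - a
        need_in[j] = need_in.get(j, 0) - a
    return support
```

For example, `decimate([(0, 0), (0, 1), (1, 0), (1, 1)], {0: 1, 1: 1}, {0: 1, 1: 1}, 0.5, random.Random(0))` returns a 0/1 assignment of the four entries in which every row and every column contains at least one link. Because each entry is drawn from its exact conditional marginal, the toy version samples the target distribution exactly; with BP marginals the same loop is only approximate.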